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ABSTRACT 

The one-point probability distribution function (pdf) of the large-scale density field is 
an important tool to follow the evolution of cosmological structures. In this paper we 
present a new model for this pdf for all regimes and all densities, that is from linear to 
highly non-linear scales and from rare voids up to rare high densities. This is probably 
the simplest model one can build which is consistent with normalization constraints 
and known rigorous results (the quasi- linear regime and the rare void limit) . It is fully 
parameterized by the non-linear variance and skewness. We obtain a good agreement 
with N-body data from realistic cosmological simulations of the VIRGO consortium 
and we find that it works significantly better than previous models such as the lognor- 
mal model or the Extended Perturbation Theory (EPT) . We explain this success as a 
result of the tight constraints onto the pdf provided by these consistency conditions. 
We also point out that while the Lagrangian dynamics of typical fluctuations is quite 
complex the statistical outcome seems rather simple. This simple model should be 
useful for studies which require a realistic and convenient description of this pdf. 

Key words: Cosmology: theory - large-scale structure of Universe Methods: ana- 
lytical, statistical, numerical 



1 INTRODUCTION 

In usual cosmological scenarios the large-scale structures ob- 
served in the present universe have formed by gravitational 
instability from small initial density fluctuations. Moreover, 
in a CDM-like model the system is governed at large scales 
by the collisionless gravitational dynamics of the dark mat- 
ter which builds a wide variety of structures, from rare ex- 
panding voids to filaments and almost spherical massive 
clusters. Besides, in such models where the amplitude of 
density fluctuations grows at smaller scales one observes 
the build-up of a hierarchical clustering process. The scale 
which marks the transition to non-linearity increases with 
time so that increasingly large and massive structures turn 
non-linear and collapse as time goes on and smaller objects 
which formed earlier become embedded within larger enti- 
ties. Then, they may follow a complex dynamics as mergings 
and disruptions take place. This builds a complex network 
(or cosmic web) with a wealth of structures and substruc- 
tures which is quite difficult to model in details. Then, at 
small scales where the baryonic density is sufficiently large 
(e.g., within collapsed halos) radiative processes like cooling 
come into play and further enhance the gas density through 



the subsequent baryonic collapse they imply. The baryonic 
matter may become a dominant component and eventually 
form stars and galaxies. Therefore, the large scale distri- 
bution of matter is closely related to the observed distri- 
bution of astrophysical objects (galaxies, clusters, Lyman-a 
clouds, etc.) although on very small scales the relationship 
can become very intricate. Thus, an important goal of obser- 
vational cosmology is to understand the formation of these 
large-scale structures. 

A first step is to investigate the distribution of matter 
on large scales which is governed by gravity (note that the 
density field is also directly probed by weak gravitational 
lensing surveys as opposed to galaxy surveys which probe 
the stellar content of the universe) . One may then study the 
mass function of specific objects, like just-virialized halos, 
following Press & Schechter (1974). Another tool is provided 
by the many-body density correlation functions which probe 
in more details the structure of the density field (e.g., Peebles 
1980). In this article we shall focus on a simpler statistics: 
the one-point probability distribution function (pdf) of the 
density V(pr). Although it discards the angular behaviour 
of the many-body correlations it contains their amplitude 
at a given scale R and it allows one to follow the evolution 
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of gravitational clustering with time. Indeed, by going from 
low to high densities one successively probes rare voids, fila- 
ments, typical halos and rare massive halos (identified with 
clusters at z = 0). We shall assume here that the initial 
conditions are Gaussian, as expected from simple inflation- 
ary scenarios. Then, at large scales or at high redshifts the 
pdf goes to a simple Gaussian. Next, as non-linearities de- 
velop and gravitational clustering builds up deviations from 
Gaussianity appear and become increasingly important. 

The first deviations from Gaussianity can be derived 
from analytical methods in the so-called quasi- linear limit, 
from perturbative expansions (Bernardeau 1992, 1994a) or 
steepest-descent methods (Valageas 2002a). In a similar 
fashion, the statistics of extreme underdensities (rare voids) 
can also be obtained for any regime (Valageas 2002a, see also 
Bernardeau 1994b). However, it has proved difficult to model 
the evolution of the full pdf V{pn) into the linear regime. 
It was proposed to use as a simple approximation a lognor- 
mal pdf (Coles & Jones 1991) which is always well-behaved 
as the density is positive (while the Gaussian pdf extends 
to increasingly negative densities in the non-linear regime) . 
However, this model is inconsistent with the known results 
and it appears to show significant discrepancies with de- 
tailed numerical simulations (Bernardeau & Kofman 1995). 

A more theoretical approach has been to investigate a 
Lagrangian method where one tries to follow the evolution 
of individual density fluctuations. This can be expressed in 
terms of a mapping between the linear density contrast and 
the actual non-linear density contrast. As seen in Valageas 
(1998) and Protogerros & Scherrer (1997), using the map- 
ping given by the spherical dynamics yields the exact quasi- 
linear limit and provides good results up to a rms linear 
density fluctuation a 1. This can be understood in sim- 
ple terms from the steepest-descent method developed in 
Valageas (2002a). A detailed study of the spherical model, 
compared with perturbation theory, various local approx- 
imations and numerical simulations, is presented in Fos- 
alba & Gaztanaga (1998). A different approach based on 
the excursion set formalism which also includes the spheri- 
cal model is described in Sheth (1998). However, the highly 
non-linear regime where mergings and tidal disruptions play 
a key role seems beyond the reach of such approaches (at 
least so far). Other approaches to tackle non- linear gravity 
include semi-analytical techniques which approximate the 
Vlasov-Poisson dynamics by simpler systems of equations 
(for a comparison of such models in the weakly non-linear 
regime see Munshi et al. 1994). 

Since following the dynamics of individual density fluc- 
tuations appears problematic in the non-linear regime (be- 
cause of mergers, etc.) other approaches have been advo- 
cated which directly consider the statistical properties of the 
system. For instance, assuming specific scaling-laws for the 
many-body correlation functions inspired from the stable- 
clustering Ansatz (see Peebles 1980), Balian & Schaeffer 
(1989) were able to derive the many properties they imply 
for the density field. Note that the formalism developed in 
that work can be used in more general cases (like the use 
of cumulant generating functions which we also take advan- 
tage of in this paper). Then, Scoccimarro & Frieman (1999) 
devised a model (HEPT) to obtain the amplitude of these 
many-body correlations in the highly non-linear regime (as- 
suming again that they obey the same scaling laws). They 



proposed to extrapolate to the non-linear regime the tree- 
level perturbation results through a specific limit and ob- 
tained a good agreement with numerical simulations. 

A somewhat more empirical attempt to follow the evo- 
lution of gravitational clustering was proposed in Colombi 
et al. (1997). Working also with the pdf V(p R ) itself (or 
more precisely its cumulant generating function ip(y)) they 
modeled the non-linear pdf through a simple empirical mod- 
ification of the quasi-linear prediction, which they dubbed 
Extended Perturbation Theory (EPT). More precisely, they 
suggested to use the functional shape derived in the quasi- 
linear limit but to treat the local slope n of the power- 
spectrum as a free parameter in order to extend the model 
into the non-linear regime. They obtained in this manner a 
reasonable agreement with numerical simulations. In this 
article we follow the same approach as we directly work 
with the generating function tp{y) but in addition to the 
quasi-linear limit we also require that it satisfies the rare- 
void limit. Then, we build the simplest possible model which 
obeys all these constraints. As EPT it is fully parameterized 
by the non-linear variance and the skewness. Using standard 
models for these two quantities we show through a compar- 
ison with numerical simulations that our approach provides 
a good prediction for the pdf in all regimes. In particular, it 
works significantly better than both the lognormal approxi- 
mation and EPT. 

This paper is organized as follows. In section 2 we 
present our model and we recall both the EPT and log- 
normal models. Then, we compare our results to N-body 
simulations in section 3 and we conclude in section 4. We 
also explain the numerical implementation of our model in 
appendix A and we discuss its robustness in appendix B. 

2 MODELS FOR THE DENSITY 
DISTRIBUTION FUNCTION 

2.1 General framework 

In this article we wish to build a simple model for the one- 
point probability distribution function (pdf) of the density, 
which describes the formation of large scale structures in the 
universe through gravitational instability. We first define the 
overdensity at scale R as the mean overdensity over a spher- 
ical cell of radius R and volume V (i.e. we use a spherical 
top-hat filter): 

n =J v fef = 1+SR , (1) 

where p is the mean density of the universe, 8r is the den- 
sity contrast at the same scale R and x is the comoving 
coordinate. The first few cumulants of the overdensity obey: 

<1) = 1, <PK> = 1, (pl) c =I and ^ = S 3 . (2) 

Here we noted (..) the average over the initial Gaussian con- 
ditions (or over space by ergodicity) while the subscript c 
refers to "cumulants" (or connected parts) as opposed to 
simple moments. We also introduced in eq.(2) the variance 
£ and the skewness 53 while the first constraint in (2) sim- 
ply means that the pdf V{pr) is normalized to unity. A 
Gaussian density field is fully defined by its variance and all 
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higher-order cumulants vanish (in particular S3 =0). As is 
well-known, although we assume the initial conditions to be 
Gaussian the non-linear gravitational dynamics will grad- 
ually build non-Gaussianities which we need to take into 
account. Therefore, we introduce the cumulant generating 
function (p(y) defined by: 



<p(y) = 



S P y p , 



(3) 



p=0 



with: 



(P P R ) 



S =0, S 1 =S 2 = 1 and for p > 3 : S p = . (4) 

Thus, all cumulants can be obtained from ip{y) (assuming 
the series in eq.(3) converges). Moreover, this generating 
function is simply the logarithm of the Laplace transform 
of the pdf V(p R ): 



-¥>(»)/£ 



which can be inverted as: 



V{ P R) 
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-zoo 



d V_ e [p R y-v(v)]/S_ 



2-KiS, 



(5) 



(6) 



One advantage of working with the generating function ip(y) 
rather than with the pdf V(pn) itself is that it provides a 
very simple and flexible tool to parameterize the non-linear 
evolution of gravitational clustering. For instance, the four 
normalization constraints (2) are automatically satisfied if 
we ensure that the Taylor expansion at y = of <p{y) is: 
f{y) = V ~ ?/ 2 /2 + Szy 3 /6 — ... It is easier to implement such 
local conditions than the integral constraints (2) onto the 
pdf V(pr) itself. However, one must also check that the pdf 
obtained from eq.(6) is positive. More precisely, we have: 



V(p R ) > for p R > and P(p R ) = for p R < 0. 



(7) 



As seen from eq.(6), the second constraint in (7) is satisfied 
as soon as <p{y) is regular over the right half-plane Sft(y) > 
and grows more slowly than \y\ for 5ft(y) — ► +00. 



2.2 Quasi-linear regime 

The functions V{pr) and tp(y) can be derived in a rigorous 
manner at large scales or at early times where the amplitude 
of the density fluctuations goes to zero. In this quasi-linear 
regime, it is convenient to introduce the rms linear density 
fluctuation a: 



(8) 



where 8l,r is the linear density contrast smoothed at scale 
R. Of course, at large scales we have £/cr 2 — » 1. Then, the 
appropriate generating function (p{y) is now defined as: 



/*oo 

/ dp R , 
Jo 



-PRV/o 



-P(pr) 



Thus, it is related to ip(y) defined in eq.(5) by: 



<p(y) 



(9) 



(10) 



Again, at large scales we have <fi(y) — * $>(y). It turns out that 
the function ip(y) has a finite limit at fixed y in the quasi- 
linear limit (T*^ — ► which can be computed analytically. 
This may be done through a perturbative expansion of the 
density field (Bernardeau 1992, 1994a) or a steepest-descent 
method (Valageas 2002a). In both cases, one obtains the 
cumulant generating function <p{y) as the solution of the 
implicit system: 



<p(y) = y<(?) + - 

f = -y C'(f) 



(11) 
(12) 



where the function £(f) is closely related to the spherical 
collapse. Indeed, it is defined by the implicit relation: 



C(f) = T 



. z [ar)^R\ 



(13) 



where the function T describes the mapping from the linear 
density contrast Sl to the non-linear overdensity pR given 
by the spherical collapse: 



Pr — J-{5l) before shell-crossing. 



(14) 



In the case of a critical-density universe, the function !F{8l) 
can be written in terms of trigonometric and hyperbolic 
functions (Peebles 1980). However, it turns out that the de- 
pendence of T on the cosmology is quite weak and that very 
good results can be obtained by using its limit for f2 m — * 0, 
with Qa = 0, (see Bernardeau 1992): 



(-§*) 



-3/2 



(15) 



Moreover, the power-law behaviour J-{5l) oc (— Sl)~ 3 ^ 2 for 
Sl — » —00 is exact for any values of f2 m and J1a. The implicit 
equation (13) can be simplified for a power-law linear power- 
spectrum Ph{k) oc k n . We shall assume hereafter that wc 
have — 3 < n < 1, which agrees with usual cosmological 
models like CDM power-spectra. In this case, we have: 

(16) 



a 2 (R) <xR- (n+3) and also: n + 3 = -' lb " 7 



dlni? 



and eq.(13) writes: 

c = t r-f r (n+3)/6 l 



(17) 



Using the expression (15) for T we obtain the inverse f(C) 
as: 

-(0 = -fc (n+3)/6 + fc (n - 1)/6 . (18) 

This equation fully defines <fi(y) = *fi(y) in the quasi-linear 
regime, through eqs.(ll)-(12), whence the pdf V(pr) from 
cq.(6). We must note that in studies which focus on the 
quasi-linear regime one often considers the density contrast 
6r rather than the overdensity pR. This yields an additional 
factor — 1 to T and £ while (p(y) is defined with Si = 0. 



2.3 Rare voids 

In addition to the quasi-linear regime, one can also derive 
exact results for any a 2 in the limit of extreme underdensi- 
ties. This corresponds to the limit of large positive y and 
t and small £ at fixed a. This is again done through a 
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steepest-descent method which is appropriate to rare events 
(Valageas 2002b). The spherical saddle-point of the relevant 
action is actually the same as the one obtained in the quasi- 
linear regime so that the generating function 0(y) is still 
given by the implicit system (11)-(12) while the function 
f(£) agrees with eq.(18): 



while the underdense limit implies from eq.(19) and eq.(20): 



f(C) - §c<- 1)/6 



for t a and any a. 



(19) 



Note that in Valageas (2002b) the factor 3/2 was replaced 
by 27/20 since we considered a critical-density universe (this 
again shows that the dependence of r(£) on cosmology is 
rather weak). As explained in Valageas (2002b), the asymp- 
totic behaviour (19) holds for rare underdensities whatever 
the value of a (hence it remains valid in the highly non-linear 
regime). The reason why the generating function (p{y) can 
still be written in terms of the spherical dynamics !F(8l,r) is 
that rare voids become increasingly spherical as they expand 
and they are not affected by shell-crossing yet. Of course, 
if one follows the evolution of a particular void after some 
time its expansion will be stopped or slowed down when it 
encounters more extreme voids (while filaments will form at 
their boundaries). This simply means that such a void is no 
longer part of the rare events described by eq.(19) which ap- 
plies to increasingly rare (but also more extended) voids as 
time goes on. As noticed in Valageas (2002b), for steep lin- 
ear power-spectra n > — 1 the radial profile of the spherical 
saddle-point which led to eq.(19) shows some shell-crossing 
at radius R. Although this behaviour is likely to modify the 
numerical factor 3/2 in cq.(19) we can expect the exponent 
to remain correct for n > —1 (on the other hand, note that 
for CDM-like power-spectra we have n <, —1 at the scales 
of interest). 



2.4 A simple model 

Gathering the results recalled in the previous sections we 
now build a simple model to describe the evolution of the pdf 
V(pr). Of course, we wish to recover both the quasi-linear 
regime presented in section 2.2 and the rare underdensities 
regime recalled in section 2.3. From eq.(10), we can see that 
if the generating function ip(y) is given by an implicit system 
of the form (11)-(12), then the generating function tp(y) is 
defined by the same system where ((t) is replaced by £(r) 
with: 




(20) 



Since in the quasi-linear regime the inverse t(Q has a sim- 
pler expression than C(r), see eq.(18), we work with t(() 
and we parameterize the generating function p(y) by: 



<p{y) 



dr 



(21) 
(22) 



In fact, we see from eqs.(21)-(22) that — t 2 (£)/2 is merely 
the Legendre transform of <fi(y)- Next, the quasi-linear limit 
implies from eq.(18) and eq.(20): 



a^0: t(C) 



2 s 2^ 



-l)/6 



(23) 



C^0: 




(24) 



In addition to these asymptotic behaviours, we also have 
the normalization constraints (2) or (4). From eqs.(21)-(22) 
it is easy to see that the conditions (2) are equivalent to the 
following constraints for r(£) (note that y = corresponds 
to r = and Q = 1): 



r(l) = 0, r'(l) = -l 



and r"(l) = y. 



(25) 



Indeed, note that the first normalization condition (1) = 1 
in (2), or ip(0) = 0, is automatically satisfied by the im- 
plicit system (21)-(22) for any function r(£), provided the 
transform <p(y) <-> r(£) it defines is regular around (y = 
0, if = 0) <-» (C = 1, t = 0). In order to obey the constraints 
(23)- (25) we consider the simple model: 



iC) = « + K 2 + cC ( " +3)/6 + ^/4c ( " 



-l)/6 



(26) 



where the coefficients a,b,c are set by the constraints (25). 
This is consistent with the underdense limit (24) and we 
recover the quasi-linear limit (23) provided we use for the 
skewness S3 the appropriate perturbative prediction S^ h '- 



0: S3 ^S® L =5-(n + 3), 



which implies with (25): 



: 



0, 6^0, 



3 



(27) 



(28) 



Note that one often chooses the value associated with a 
critical-density universe S^ L = 34/7 - (rt + 3) rather than 
the result (27) obtained for a "zero-density" open universe. 
We prefer to use (27) which is consistent with (18) and quite 
sufficient for our purposes. In other words, we neglect the 
small dependence of the skewness on cosmological parame- 
ters. Using eq.(25) we obtain a, b and c from: 



12S3 + 36- (l-n)(13-n)f«/-£- 
~~ (n + 3)(9-n) 



b _ l-n 3 / g 1 n + 3 

~~ 12 2 V a* 2 12 - C ' 
and: 




(29) 
(30) 

(31) 



2.5 Variance and skewness 

Finally, in order to fully determine the pdf V{pR) we need 
the variance £ and the skewness 5*3. For the variance we sim- 
ply use the fit provided by Peacock & Dodds (1996) for the 
evolution of the non-linear power-spectrum P(k). However, 
one could as well use any other model which fits the data, 
like those presented in Smith et al. (2003) . The fitting func- 
tions presented in Peacock & Dodds (1996) for P(k) depend 
on the slope n(feL/2) of the linear power-spectrum Pl(^l) 
at the "Lagrangian" wavenumber &l/2 (see their article for 
details). For the numerical results presented below in sec- 
tion 3 for a ACDM model, we define this local index n from 
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the logarithmic derivative of the linear power-spectrum at 
wavenumber kh rather than fcz,/2. This provides a better fit 
to the simulations we study here and it shows the correct 
limit at large scales (i.e. kh — ► k). 

Next, we need to specify the skewness S3. We choose a 
simple interpolation between the quasi-linear prediction S^ L 
and the highly non-linear prediction Sf EPT of the Hyper 
Extended Perturbation Theory (HEPT) presented in Scoc- 
cimarro & Frieman (f999): 



tion regime, both from analytical approaches or simulation 
results (note that EPT was fitted to numerical simulations) . 

2.6 Other models for the pdf V(p R ) 

In addition to numerical simulations, we shall also compare 
our model (26) with two other prescriptions which have been 
put forward in previous works. 



S3 = 



S^+t 5 S : 



HEPT 

3 



i+r 



•il. nHEPT o 4 

with 63 = 3 



2 n 



1 _|_ 2»+l ' 



(32) 



while Sg was defined in eq.(27). The skewness again de- 
pends on the local slope of the linear power-spectrum or 
the linear variance. As in Scoccimarro & Frieman (1999) or 
Colombi et al. (1997) we define the index n to be used in 
eq.(32) as the logarithmic slope of the linear variance a , 
as in eq.(16), at the Eulerian scale R. Note that this proce- 
dure is different from the one used to obtain the non-linear 
power-spectrum (which follows Peacock & Dodds 1996) but 
in agreement with previous works we found that it yields 
better results for the skewness. Next, we also choose for the 
index n which appears explicitly in eq.(26) the value we use 
for the skewness, obtained from eq.(16). 

It may seem a bit surprising to use a different pre- 
scription for n for the variance and the skewness but this 
should be seen as a result of the uncertainty on the variance 
£ (i.e. the non-linear power-spectrum). For instance, the fit- 
ting formulae for P(k) given by Smith et al. (2003), which 
are based on a halo model, define n from the linear variance 
as in eq.(16) but at the transition scale Ro where a — 1. Of 
course, all these prescriptions are identical in the case of an 
exact power-law linear power-spectrum. From the point of 
view of this article, these points are not part of the model 
we propose for the evolution of the pdf V(pr). Our model is 
fully defined by eq.(26) and it provides P(pr) once we are 
given the variance £, the skewness S3 and the local slope n. 
Thus, the user could choose any interpolation formula for £ 
or S3 as long as it agrees reasonably well with the data. 

For simplicity we chose to interpolate between the 
quasi-linear and HEPT predictions for S3 using the simple 
variable £ . From eq.(26) one might have thought using 
the ratio £,/cr 2 . However, this would not be convenient be- 
cause the behaviour of this ratio depends on the slope n of 
the power-spectrum. Thus, if the stable-clustering Ansatz is 
valid this ratio would go to zero at small scales if n > — 2 
while it would go to infinity if n < —2. Even though the 
stable-clustering Ansatz may not be exact such a large range 
of behaviours can be seen in numerical simulations (e.g., 
Fig. 3 in Valageas et al. 2000). By contrast, £ > 1 always 
marks the transition to the non-linear regime. Of course, 
other choices than (32) are possible. For instance, one could 
integrate the interpolation formula given by Scoccimarro & 
Couchman (2001) for the bispectrum. Their interpolations 
are similar to eq.(32) except that they use a 2 rather than 
£. We tried both variables for our present purposes and we 
found that £ with the power 1.5 worked best as compared 
with the numerical simulations. However, such interpola- 
tions are not fully satisfactory as seen below in Fig. 2. The 
skewness still shows a rather large uncertainty in the transi- 



2.6.1 Extended Perturbation Theory 

In Colombi et al. (1997) it was proposed to extend the quasi- 
linear prediction (18) into the non- linear regime by leaving 
n as a free parameter which deviates from the local slope 
of the linear variance at small scales. Therefore, this model, 
dubbed Extended Perturbation Theory (EPT), amounts to 
write: 



EPT: r(C) = --C 



(n e ff +3)/6 



3„ 
+ 2 C 



(«eff-l)/6 



(33) 



where n e g(n, cr) can be obtained from S3. Colombi et al. 
(1997) also gave the following fit for n e s' 

. x a (riNL-n) 2, 2m /o„\ 
— — wltn ^ = exp[log 10 (a /(J )J, (34) 



n c ff 



and: 



x a +x~ 



riNL(n) = 



3(n- 



3 + n 



8 - 3n 
10 



. logio(^o) 



2-n 
10 ' 



(35) 



Note that a significant difference between EPT and our 
model (26) is that in the highly non-linear regime if the 
stable-clustering Ansatz is valid (whence S3 is constant with 
time at fixed physical scale) the function r(Q given by EPT 
is fixed while it keeps evolving in our model because of the 
explicit factor \JTJa^ which translates the continuing ex- 
pansion and merging of rare voids. In particular, since the 
low-£ limit of eq.(33) is different from the exact result (19) 
when a 3> 1, the EPT prediction should fail for the low- 
density cutoff of the pdf P(pr) in the non-linear regime, see 
section 3.3 below. 



2.6.2 Lognormal model 

A second popular model used to describe the evolution of 
the pdf P(pr) is the lognormal approximation (e.g., Kayo 
et al. 2001). This provides a simple expression for the pdf 
P(Pr) itself which reads: 



Piu(pr) = 



■. exp 



In 5 'ipRVl+l] 



p fl V2vrln(l + """" V 21n(l +£) 
This also gives for the skewness S3: 
S 3 = 3 + f. 



(36) 



(37) 



Of course, in the limit a — > the lognormal pdf Vin(pR) goes 
to the usual Gaussian. However, the coefficients S p it defines 
(i.e. its cumulants) do not match the exact quasi-linear result 
recalled in section 2.2. Moreover, it does not agree with the 
low density limit presented in section 2.3. Therefore, we can 
expect a significant discrepancy with numerical simulations 
at the low-density cutoff in the non-linear regime, see also 
section 3.3 below. 
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l[in h ^pc] 
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1 10 
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Figure 1. The variance computed from the simulations is com- 
pared with the analytical predictions (which follow Peacock & 
Dodds 1996, see text) over a logarithmic scale. The solid dots are 
results from numerical simulations whereas lines are analytical 
predictions. Curves and dots from top down to bottom are for 
redshifts z = 0, z = 1 and z = 2. 



3 COMPARISON AGAINST NUMERICAL 
SIMULATIONS 

We describe in appendix A the numerical implementation 
of our model for V{pn)- In particular, we explain how 
to deal with the singularity y a of the generating function 
<p{y) which appears on the negative real axis. The code 
to evaluate V(pn) can be downloaded from http://www- 
spht.cea.fr/pisp/valag/codepdf-en.html. 

On the other hand, we investigate the robustness of our 
model in appendix B. That is, we study the dependence 
of our results on the functional form (26) we chose for t(£), 
which was not fully determined by the quasi-linear and rare- 
void limits. 

Then, we compare in this section the predictions of our 
model with the results obtained from numerical simulations. 



3.1 Simulation properties 

We have used the freely available intermediate scale ACDM 
N-body simulation from Virgo Consortium* to test our ana- 
lytical predictions. The cosmological parameters are largely 
compatible with recent observations, in particular we have 
fi m = 0.3, £Ia = 0.7, the Hubble constant is Ho = 70 
km/s/Mpc and os = 0.9. The box size is Lbox = 239.5h~ Mpc 
while the number of particles is N p = 256 3 . We used three 
epochs at redshifts z = 0,1,2 to compare our predictions 
with the simulations. We refer the reader to Jenkins et al. 
(1998) and Thomas et al. (1998) for more details about the 
simulations. The huge dynamic range studied by the VIRGO 
simulation provides an unique opportunity to probe the 



length scales from the quasi-linear regime up to the highly 
non- linear regime. 

To confront analytical results with numerical simula- 
tions we have used a cubic 256 3 grid. Then, we select cu- 
bic cells with volumes l 3 rid , 8ig rid , 64/ g , rid and 512ig rid . We 
compute the occupancy of each cell and the resulting count 
probability distribution. To increase the sampling random 
shifts were given to the 3D grid in orthogonal directions. 
This whole process was repeated several times to reach the 
number of cells which is required to probe low levels of prob- 
ability. Note that an alternative method which is equivalent 
to infinite sampling is developed in Szapudi (1998). Next, 
we compare these numerical results with our analytical pre- 
dictions. Note that our analytical pdfs apply to spherical 
cells (for which exact analytical results can be derived in 
the quasi-linear and rare-void limits). Therefore, we com- 
pare the pdf obtained from the N-body simulation for cubic 
cells of volume I 3 with our prediction for spherical cells of 
radius R with the same volume: I 3 = 4nR 3 /3. We shall see 
that we obtain a good agreement which shows that such a 
change in the shape of elementary cells has no strong influ- 
ence. On the other hand, spurious effects such as the finite 
size of the simulations, shot noise and sampling errors need 
to be kept in mind while comparing analytical and simula- 
tion results. 

Finite volume effects are due to the fluctuations of the 
underlying density perturbations on scales larger than the 
simulation box. Typically the tails of the count in cell (CIC) 
statistics are affected by the finite size of the catalogue. Such 
effects have been a very active area of research in recent past 
(Szapudi et al. 2000; Colombi et al. 2000) and much of the 
effort has been concentrated on trying to understand how 
they can be "corrected" so that meaningful statistical indi- 
cators can be constructed. Thus, it is known that the high- 
density far tail is very sensitive to the presence (or absence) 
of rare clusters in the catalogue. This yields random fluctu- 
ations before the pdf shows an abrupt cutoff corresponding 
to the densest cell in the finite catalog. This can also give 
a biased estimation for low order moments of the pdf such 
as the variance and the skewness. Of course, these problems 
worsen for smaller catalogs. In our studies we have used of 
the order of 10 10 cells of various sizes. This means that we 
can construct the probabilities Pn down to 10 -10 . We have 
avoided diluting the sample and we have used the full set 
of 256 3 particles from the simulations in our construction of 
the pdf. 

Discreetness effects are due to the sampling of the un- 
derlying continuous density field with finite point sets. How- 
ever, techniques to subtract the Poisson shot noise from com- 
puted quantities are well known (see Munshi et al. 1999a for 
expressions regarding the shot noise contributions to under- 
lying cumulants of smoothed field from central moments and 
factorial moments of cell count statistics, see also Szapudi & 
Szalay 1996). It is also possible to construct the analytical 
discrete pdf by convolving it with the Poisson sampling: 



A T,f \ (PRN) N -p R N 

dp R V( P r) j^— e PR , 



N = nV, 



(38) 



http: / /www. mpa-garching.mpg.de/Virgo/ 



where N is the mean number of particles in a cell (and n 
is the mean number density of particles in the simulation 
box) . However, in most of our cases we found that this cor- 
rection is not required as theoretical predictions are already 
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in good agreement with numerical data. On the other hand, 
since the lowest overdensity pR = N/N that one can probe 
corresponds to N = 1, which yields pR = 1/N, larger cells 
which have a larger N probe farther into the low-density tail 
of the pdf. For the computation of low order moments we 
relied on factorial moments which can directly be related to 
low order moments of the underlying continuous field: 

^ n(n - 1) (Af -p+d Pn= r dM nn) Am 

N ^ Jo 

For a complete discussion of generating functions, order by 
order expansions of low order factorial moments and their 
link with the normalized cumulants parameters Sn see Mun- 
shi et al. (1999a). 



3.2 Variance and Skewness 

We first compare in Fig. 1 the variance £(R, z) we obtain 
from the procedure described in section 2.5 (following Pea- 
cock & Dodds 1996) with the numerical simulations. We 
can see that the agreement is quite good over all scales and 
redshifts we investigate in this paper. Therefore, this simple 
procedure appears to be sufficient for our purposes. Next, we 
display in Fig. 2 the predictions for the skewness Ss(R,z) 
derived from our interpolation (32) between the quasi-linear 
limit and HEPT (Scoccimarro & Frieman 1999), from the 
EPT recalled in section 2.6.1 and from the lognormal ap- 
proximation (37), as compared with N-body simulations. 
We can check that all three models are very close on quasi- 
linear scales and match the simulations. Note that for the 
lognormal model this is actually a coincidence because its 
skewness only agrees with the exact quasi-linear limit for 
n = —1 (compare eq.(37) with eq.(27)) and it happens that 
on the quasi- linear scales shown in Fig. 2 we have n ~ — 1 
(as is usual for CDM power-spectra at such redshifts). On 
the other hand, we find that our interpolation and EPT 
prediction work rather well on non- linear scales at z = 1, 2 
while the lognormal model shows a significant discrepancy. 
At z = our interpolation shows some deviation from the 
simulations but it fares much better than both EPT and 
the lognormal model. However, the comparison of the simu- 
lations at z = 0, 1 and 2 suggests that on small scales finite 
resolution effects play a significant role (indeed, at fixed co- 
moving scale one expects the skewness to grow with time, 
which is not the case in Fig. 2 below a few grid lengths). 
Therefore, some of the discrepancy between numerical sim- 
ulations and the models could be due to such numerical 
effects. 



3.3 Density pdf V{p R ) 

Finally, we compare in Figs. 3-5 the pdfs V(pr) obtained 
from the different theoretical models with the results from 
N-body simulations. We display our model (26), EPT (33) 
and the lognormal model (36) at redshifts z = 0, 1 and 2 and 
for four scales. We first plot PrP(pr) (in logarithmic scales) 
which is the quantity of interest for most practical purposes. 
Indeed, PrV(pr) is also the fraction of matter per logarith- 
mic interval of overdensity dln(pfl) at scale R. Moreover, 
it enables one to clearly see the evolution of the pdf into 



the non-linear regime. Thus, we see that while in the quasi- 
linear regime there is only one density scale: the mean den- 
sity of the universe (that is pR = 1 in our units) and the pdf 
'P(Pr) tends to a simple Gaussian, as one goes deeper into 
the non-linear regime two density scales gradually appear. A 
low-density scale p„<l marks the low-density cutoff of the 
pdf, below which one enters the rare-voids regime recalled 
in section 2.3. As seen in Valageas (2002b), this cutoff p v is 
given by: 

a > 1 : Pv = a(R)- 6/{1 - n) « 1, (40) 
and the pdf V(pr) shows the modified exponential falloff: 

PA«P, : P(pfl) ~e- 9p H <1_ " )/3/(8CT2) . (41) 

Note that in the quasi-linear regime (i.e. a <C 1) the low- 
density falloff of V(pr) is still given by eq.(A7) but the den- 
sity scale p v goes to unity. This part of the pdf is governed 
by the last term in eq.(26), that is by the behaviour (24). A 
second density scale ph which marks the high-density cutoff 
of the pdf V(pr) is simply given by the two-point correlation 

I- 

Ph = s 3 1 = i4r » L ( 42 ) 

{Pr)c 

It corresponds to the typical overdensity within virialized 
halos at scale R while higher densities are associated with 
rare massive objects. The reason why we defined ph in 
cq.(42) from the second and third cumulants of the density 
pdf is that they are the lowest order cumulants which are 
mostly sensitive to the high-density part of the pdf. Thus, 
we can see from the figures that the mean ( P r) = 1 probes 
the whole range p v < pr < ph and we can check that at 
small scales the cutoff ph can indeed be significantly higher 
than £ (compare Fig. 3 with Fig. 1) following the growth 
of the skewness S3. On the other hand, higher-order cu- 
mulants probe increasingly large and rare overdensities and 
exhibit higher uncertainties. As for the low-density cutoff 
p v , the high-density cutoff ph goes to unity in the quasi- 
linear regime so that both density scales merge to the mean 
density of the universe. 

We can see in Figs. 3-5 that our simple model (26) 
agrees very well with the results from numerical simulations. 
Note moreover that our model has no free parameter beyond 
the variance (obtained from Peacock & Dodds 1996) and 
the skewness. The EPT model recalled in section 2.6.1 is 
also fully parameterized by £ and S3 , but we can see that it 
shows significant discrepancies at small scales. This can be 
traced to its low-density behaviour which does not obey the 
properties (24) and (A7). Indeed, it yields a much sharper 
low-density cutoff. In order to fulfill the normalization con- 
ditions (2) this implies a high peak at p v (to compensate for 
the smaller contributions from lower densities to the normal- 
ization (1) = 1) and a lower plateau at intermediate densities 
(to compensate for this high peak at p v into the normaliza- 
tion (pr) = 1). This failure to follow the evolution of the 
pdf into the non-linear regime comes from the fact that the 
EPT model (33) attempts to parameterize the evolution of 
rare underdensities and overdensities in the same fashion. 
That is, both the small £ (£ <C 1) and large £ (£ > 1) parts 
of the function t(Q are parameterized by the same param- 
eter n e ff. In particular, if the stable-clustering Ansatz were 
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Figure 2. The skewness 53 computed from the simulations is compared with analytical predictions at redshifts z = 0, 1, 2. Solid dots are 
results from numerical simulations whereas various lines are analytical predictions as labeled in the left panel. We display our interpolation 
eq.(32) (P), the EPT prediction from section 2.6.1 (EPT) and the lognormal approximation cq.(37) (Ln). 




Figure 3. Analytical and numerical probability distribution func- 
tions are plotted for various smoothing scales I (in grid units) as 
indicated in each panel. Lines of various styles represent analyt- 
ical predictions from various theoretical models as labeled in the 
upper left panel. We show our model (26) (P), the EPT pre- 
diction (33) (EPT) and the lognormal approximation (36) (Ln). 
Dark solid lines are the results from numerical simulations. We 
have plotted p R V{pn) vs p^j (on logarithmic scales) which shows 
clearly the evolution of the pdf into the non-linear regime. 



valid (that is 5*3 goes to a constant with time in the non- 
linear regime at fixed physical scale) the function t(Q from 
EPT would become constant. However, it is clear that this 
would miss the physics at work at low densities. Indeed, even 
though high-densities are governed by virialization processes 
which might stabilize high-density peaks observed in phys- 
ical units, rare voids are governed by different phenomena 
and keep expanding faster than the mean universe. This is 



Figure 4. Same as previous figure but for z=l. 



captured by the last term in our model (26) which shows 
the explicit time dependence £/<r 2 . This point is further dis- 
cussed in section 3.6.3 in Valageas (2002b). Note that nei- 
ther our model nor EPT assume that the stable-clustering 
Ansatz is valid but this discussion enables one to clearly see 
that rare underdensities and overdensities require distinct 
treatments. 

The simple lognormal model shows a smooth parabolic- 
like shape in the scales used in Figs. 3-5 (actually log [pnP(pn)] 
would give an exact parabola over log pr) and it might ap- 
pear to work better than EPT. However, we can see that our 
model (26) provides significantly better results as compared 
with numerical simulations. Indeed, the lognormal model 
yields low-density and high-density falloffs which are clearly 
too shallow and it does not follow very well the flat plateau 
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Figure 5. Same as previous figure but for z=2. 



in-between p v and ph because of its parabolic shape (the cur- 
vature is too high). The rather good success of our simple 
model (26) is due to its correct description of the low-density 
cutoff (as compared with EPT) and the normalization con- 
straints (2). Indeed, the rare- void limit (24) together with 
the normalization (1) = 1 set the location and the height 
of the low-density cutoff V(p v ) (as well as lower densities). 
Next, the normalizations {pr) c = £ and (Pr) c = set 
the location and the height of the high-density cutoff V(ph) 
(see eq.(42)). Finally, the normalization (pr) — 1 provides 
a further constraint on the pdf V(pr) over the whole range 
pv < Pr < Ph- All these conditions are sufficient to provide 
a tight constraint onto the pdf V(pr) at densities pR < p^- 
This property is obviously a direct consequence of the weak 
dependency of the functions r(£) and tp(y) onto additional 
free parameters discussed in appendix B once all constraints 
(23)-(25) are taken into account. 

It might be possible to have a significantly different pdf 
'P(Ph) while satisfying the constraints (23)-(25) if we allow 
large oscillations in the intermediate range p v < Pr < Ph- 
However, this would require a rather contrived model and 
Figs. 3-5 show that results from N-body simulations do not 
exhibit such peculiar behaviour. On the contrary, although 
the numerical simulations may not go sufficiently far into the 
non-linear regime to draw definite conclusions, it seems that 
no intermediate density scale shows up in the range p v < 
Pr < ph and the pdf exhibits a smooth power-law behaviour 
over this range. In this sense, the outcome of gravitational 
clustering appears quite simple. Therefore, we can conclude 
that our model is very robust over the whole range pR < ph- 
Moreover, its prediction for V(pr) is the simplest one which 
can be made consistent with all known constraints. 

Any deviation from our result over this range (in fact 
over p v < pR < ph since lower densities are known from sec- 
tion 2.3) would require additional parameters which would 



for instance introduce new density scales. An alternative 
possibility would have been a bimodal distribution. For in- 
stance, in a fashion similar to the behaviour shown by EPT 
in Fig. 3, one could have imagined keeping the only two char- 
acteristic density scales p v and ph but having a specific 
power-law in the intermediate range which only connects 
smoothly to one extremity (for instance ph) while the match- 
ing at the other hand (then p v ) involves a sharp transition. 
Another possibility would have been two distinct power-laws 
attached to each boundary {pv,Ph) which match in-between. 
It is not clear whether such behaviours could have been ruled 
out a priori. However, one may argue in this direction as fol- 
lows. If we assume that the stable clustering Ansatz is valid, 
virialized objects are frozen in physical coordinates so that 
the high-density part of the pdf remains constant. Then, 
the intermediate power-law range only grows towards un- 
derdensities as increasingly underdense and extreme voids 
(as measured in the initial conditions) fill the whole volume 
and see their fast expansion stopped as they join. In this 
process, the matter at their boundaries forms filaments and 
virialized objects which become part of the matter described 
by the power-law regime while the mass associated with rare 
underdensities below p v keeps declining. Then, since at the 
beginning of this process the two density scales p v and ph 
coincide and the process repeats itself identically with time 
it is natural to expect the build-up of a unique power-law 
regime which smoothly connects both end-points {pv, ph) 
(while a double power-law could have been expected if the 
matter in this range would originate both from p v and ph, 
the flux from one extremity increasing as one gets closer). In 
practice, we do not expect such a stable clustering Ansatz 
to be valid, as moderate overdensities experience mergings 
and tidal effects. Nevertheless, the scale-free nature of grav- 
ity and the approximate self-similarity of the process, which 
starts from a unique origin (p v — ph), could still be the 
source of the simple behaviour seen in Figs. 3-5. 

We must point out that this intermediate density range 
actually corresponds to the part which is most difficult to 
follow from a Lagrangian point of view. Indeed, it covers a 
wide range of objects (about three orders of magnitude over 
the density in the upper panels of Fig. 3), from low-density 
filaments which surround large voids up to typical virialized 
halos. The common property of these objects is that they 
have undergone strong interactions with the neighbouring 
density fluctuations and strong mergings. This makes it dif- 
ficult for simple Lagrangian mappings (which attempt to 
map the linear density contrast Sl to the actual non-linear 
density contrast 8, possibly with some scatter) to model this 
density range. It is clear that our approach is completely 
different from this point of view as we do not try to iden- 
tify non-linear density fluctuations from the linear density 
field on a one-to-one basis. On the contrary, we directly fo- 
cus on the pdf V(pr), which is a statistical quantity, and 
we try to build a simple model which satisfies all known 
constraints. As seen above, it appears that this is actually 
sufficient to derive the pdf V(pr) over this density range up 
to a good accuracy. As noticed above, the reason for this 
pleasant result is that the intricate merging process which 
"shuffles" the matter associated with these typical density 
fluctuations does not bring about new scales and it yields 
a simple power-law behaviour in-between the low and high 
density cutoffs p v and ph- Therefore, while the gravitational 
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dynamics of these individual objects is extremely complex 
(they may even lose their identity through mergings or dis- 
ruptions) the statistical outcome is very simple. This sug- 
gests that the appropriate method to investigate this system 
in a theoretical and rigorous manner from the equations of 
motion should rely on a statistical analysis. 

On the other hand, we must point out that the very 
high-density regime pr S> pn is not constrained by the con- 
ditions (2) which allowed us to obtain V(pr) for pR < pn 
with a good accuracy. As recalled in eq.(Al), our model usu- 
ally yields a singularity y s < for the generating function 
f(y) (at least for n < 0) which translates into a simple expo- 
nential cutoff for the high-density tail of V(pr), see eq.(A2). 
Nevertheless, we must note that this behaviour should not 
be taken at face value and the very high-density tail could 
exhibit a different shape. As seen in Valageas (2002a) (sec- 
tion 3.6), this is actually the case in the quasi-linear limit 
where the singularity y s is somewhat spurious and one must 
follow a second branch of tp(y) beyond y s which yields a high- 

_ (re + 3)/3, 2 

density tail V(pr) ~ e p r /<t which is shallower than 
a simple exponential for n < 0. As discussed in Valageas 
(2002b) one can expect a similar behaviour at very high 
densities in the non-linear regime. However, at the scales 
and redshifts shown in Figs. 3-5 it appears that our model 
works reasonably well up to the highest densities probed 
by the numerical simulations and even higher densities are 
probably too rare to be of practical interest. On the other 
hand, in the quasi-linear regime our model goes to the exact 
quasi-linear limit (like EPT). As seen in Valageas (2002a), 
in this regime too the far tail where the deviation from the 
simple exponential cutoff discussed above could be seen cor- 
responds to very rare events which are usually irrelevant. 

Finally, we show for completeness in Figs. 6-8 the pdf 
V(Pr) from the various theoretical models and the N-body 
simulations. This emphasizes the low-density cutoff p v but 
the high-density cutoff ph is blurred by the huge vertical 
scale and cannot be easily distinguished from the intermedi- 
ate power-law part. Of course, as in Figs. 3-5 we can check 
that our model (26) works best, while EPT yields too sharp 
a low-density cutoff at a larger density and the lognormal 
model gives too shallow a cutoff at a smaller density. 



4 CONCLUSION 

Thus, in this paper we have presented a new model to de- 
scribe the evolution of the density probability distribution 
function V{pFt). This allows us to follow the dynamics of 
gravitational clustering on cosmological scales from the lin- 
ear regime up to the highly non-linear regime. Taking advan- 
tage of the known rigorous results which have been obtained 
in the quasi-linear limit and the rare underdense limit, we 
have built the simplest model which satisfies both these con- 
straints as well as normalization conditions. Our model is 
fully parameterized by the non-linear variance and the skew- 
ness. Using standard estimates for these two quantities we 
have shown that our predictions match N-body simulations 
over the scales and redshifts of interest. Moreover, we have 
found that our model works significantly better than previ- 
ous approximations (EPT and the lognormal). 

We have explained that this success is due to two prop- 
erties: i) the correct handling by our model of the rare void 
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Figure 6. Analytical and numerical probability distribution func- 
tions are plotted for various smoothing scales I (in grid units) as 
indicated in each panel, over logarithmic scales. Lines of vari- 
ous styles represent analytical predictions from various theoreti- 
cal models, while the numerical data is shown by dark solid lines, 
as in Fig. 3. 
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limit and ii) the simple statistical outcome of gravitational 
clustering over the range spanned by typical fluctuations. 
Indeed, following previous works (e.g., Balian & Schaeffer 
1989, Colombi et al. 1997, Valageas 1998,1999) we have 
checked that in the non-linear regime two distinct density 
scales appear (which merge to the mean density of the uni- 
verse p in the linear regime): a low density cutoff p v <p 
and a high-density cutoff p h 2> p. The low density cutoff 
p v marks the transition to rare voids which keep expanding 
faster than the Hubble flow while the high density cutoff 
Ph marks the transition to rare overdensities. The inter- 
mediate range p v < pR < ph corresponds to typical den- 
sity fluctuations which have undergone mergings and tidal 
disruptions and spans a large variety of objects: from fil- 
aments surrounding large voids to moderate virialized ha- 
los. These structures cannot be followed by Lagrangian ap- 
proaches since they have no well defined identities (because 
of mergings and disruptions) but it appears that the sta- 
tistical outcome of this intricate gravitational dynamics is 
quite simple: the pdf obtained from numerical simulations 
exhibits a simple power-law behaviour over the whole range 
Pv < Pr < Ph (i.e. no additional density scale appears). 
Then, this simple behaviour together with the rare- void limit 
and the normalization conditions provide a tight constraint 
on the pdf V(pr) over pR < ph which can thus be predicted 
with a good accuracy. This also explains why it is possible to 
build a satisfactory model using only £ and S3. As described 
in section 3.3, these two quantities provide the location pn 
(see eq.(42)) and the height 'P(ph) of the high-density cutoff. 
Then, a simple power-law matching to the low-density cut- 
off p v (whose location and height are explicitly known as a 
function of a 2 ) allows a complete description at all densities 
below (and also of the order of) ph- 

On the other hand, we have noticed that there are no 
rigorous results for the very high-density limit pR 3> pn 



so that our model should be viewed with some caution in 
this regime. In particular, although our simple model usu- 
ally yields an exponential tail at high densities we explained 
that one could expect a more general shape (the exponen- 
tial of some power-law) as discussed in Valageas (2002b). 
If this high-density falloff were sharper than exponential it 
could be handled within our framework in a straightforward 
way by modifying the large-£ part of the function t(£) we 
used in eq.(26) and making sure it yields no singularity. By 
contrast, if this high-density cutoff were shallower than ex- 
ponential it would require a more important modification 
(see also Valageas 2002a for a similar case). A possible way 
to do so would be to consider the pdf and the cumulant gen- 
crating function associated with the logarithm of the density 
ln pr rather than the density pR itself. Here we may note 
that going to higher orders over S v (i.e. adding further con- 
straints to (2) by including higher-order moments) does not 
appear to us to be the most promising way to improve this 
model. Indeed, it would add some further parameters which 
are not accurately known without ensuring that the large 
density limit is correct. In our view, one would rather like 
to derive the large-density behaviour of the pdf 'P(pr), or 
of the generating function t((), and build a simple model 
for r(C) which recovers this asymptotic behaviour (as we 
did for the underdense limit) . Unfortunately, this behaviour 
has not been derived yet (although one might try a version 
of the Press-Schechter (1974) prescription coupled to the 
stable-clustering Ansatz or some halo model). As discussed 
above, in such a case it could be more convenient to work 
with ln pr. 

However, we shall not explore this point further in this 
paper because we have found that our model works quite 
well over the scales and redshifts of interest. Moreover, the 
very high-density tail pR S> ph where such deviations might 
appear is beyond the reach of these simulations and it corre- 
sponds to very rare overdensities which should be irrelevant 
for most practical purposes. In particular, the robustness of 
our model discussed above for pR < ph ensures that we ob- 
tain good predictions for density fluctuations which occupy 
most of the volume and contain most of the matter of the 
universe. Besides, we can note that the constraint provided 
by the skewness actually means that we correctly describe 
the near high-density tail. 

Therefore, we expect that this simple model should be 
useful for cosmological studies which require a realistic esti- 
mate of the probability distribution V(pr), from rare voids 
up to rare overdensities and from the linear regime up to 
the highly non-linear regime. In fact, this model is proba- 
bly the simplest one which can be built consistently with 
all known results. It also suggests that a theoretical study 
of gravitational clustering on non-linear cosmological scales 
should rely on a statistical analysis rather than a detailed 
Lagrangian approach, as the properties of the system seem 
to be much simpler within a statistical perspective. However, 
such an analysis still remains to be done. 

Finally, although we have focused on the one-point pdf 
in this paper, it is possible to extend our approach to analyze 
the bias associated with overdense cells and the detailed de- 
scription of their multivariate distribution. A detailed anal- 
ysis in line with Munshi et al. (1999b) and Bernardeau & 
Schaeffer (1999) will be presented elsewhere. 
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Figure Al. The integration path over the complex y— plane used 
to compute the pdf 'P(pr) from eq.(A3). We show the paths ob- 
tained at z = and at the scale £ = 4Z gr i<j for pR = 100, 10 and 5 
(this corresponds to the lower left panels in Figs. 3,6). 
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APPENDIX A: NUMERICAL 
IMPLEMENTATION 

We have described in section 2 two models which yield the 
pdf V(pr) through its cumulant generating function ip(y), 
the latter being determined through the implicit system 
(21)-(22). We explain in more details here how to compute 
numerically 'P(pr) within this framework. As is well-known, 
in the context of large scale structures the functions r(£) 
which appear in the implicit system (21)-(22) usually yield 
a singularity y a on the negative real axis for <p(y) because 
the function r(y) is bivaluate (e.g., Bernardeau & Schaeffer 
1992, Bernardeau 1992, Valageas 2002a). This means that 
the parametric representation (21)-(22) for f(y) yields two 
branches (obtained for r > r s and r < t s ) which join at 
y s , as seen for instance in Figs. 3-5 in Valageas (2002a) 
(see also Figs. B1-B2 below). From the definition (3)-(5) of 
<p{y), the branch of interest is the one which runs through 
(y = 0, (p = 0) and (C = 1,t = 0), where we have the 
Taylor expansion tp(y) = y — y 2 /2 + .. (as we noticed be- 
low eq.(18) the linear term y is usually removed in studies 
which focus on the quasi-linear regime by working with 8r 
rather than pn). As explained in Valageas (2002a), the non- 
perturbative steepest-descent approach shows that in the 
quasi-linear limit the second branch of tp(y) is actually rel- 



evant as it governs the high-density tail of the pdf V(pr). 
However, for our present purposes it is irrelevant because 
our perspective is quite different. In this paper we do not 
compute V(pr) from the equations of motion, we merely 
build a phenomenological model for the pdf. Hence we actu- 
ally define (p(y) from the implicit system (21)-(22) so that 
the singularity y a is no longer an artifact but an element of 
the definition of ip(y). As described in Colombi et al. (1997), 
one obtains from eqs.(21)-(22) the behaviour of <p(y) near its 
singularity as: 



y >ys ■ f(y) = <£ s + r s (y - y s ) + a a (y - y a ) 



3/2 



+ ■ 



(Al) 



which implies through the inverse Laplace transform (6) the 
high-density behaviour: 



Pr > 1 + £ ■ V(pr) oc p R 



5/2 e -|w.|f>H/C 



(A2) 



In practice, we must ensure that the numerical computa- 
tion of the inverse Laplace transform (6) does not cross the 
branch cut y < y s . Moreover, as explained in Colombi et al. 
(1997), it is important to choose the integration path in the 
complex y-plane such that the argument of the exponential 
in eq.(6) is real, which avoids oscillations and ensures a fast 
convergence of the integral. To do so, one starts on the real 
axis at y — y c where y c is the saddle-point of the expo- 
nent (the path is symmetric with respect to the real axis). 
However, as recalled above we must not cross the real axis 
below y s , hence for y c < y s one usually starts the integra- 
tion from y 3 (e.g., Colombi et al. 1997, Munshi et al. 2004). 
However, this procedure is not fully satisfactory, especially 
in the highly non-linear regime. We found that it is more 
efficient to first perform two integrations by parts in eq.(6) 
which yields: 

V ( PR ) = J_ / l e [P«-»«^/«-^]/{, (A3) 
Then, the saddle-point is given by: 

P^-J ^-=^fi ^ y c . (A4) 

Now the saddle-point y c "feels" the existence of the singu- 
larity y s and always obeys y c > y„. Indeed, from eq.(Al) 
one can easily see that we have the asymptotic behaviour: 



PR 



yc - y s 



PR 



(A5) 



which automatically gives the high-density behaviour (A2) 
from eq.(A3). Note that a single integration by parts would 
also give y c > y s but it would yield the spurious scaling 
(y c — y a ) ~ (£/pr) 2 which would naively imply a power- 
law prefactor p R 3 instead of p~^ 5 ^ 2 in eq.(A2). Of course, in 
principle any procedure (with none or several integrations by 
parts) must give the same results but those which are not 
well-suited to the singular behaviour (Al) imply a higher 
computational cost for a given accuracy. On the other hand, 
the underdense regime (pr — > and y c — » +oo) shows no 
particular problem. As seen in Valageas (2002b) the exact 
asymptotic behaviour (19), which is verified by our model, 
leads to: 



V -» 
and 



+oo : ip{y) <x y 



(l_„)/(4-„) 



(A6) 
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Figure Bl. The function y[Q obtained at z = and at the scale 
I = Zgrid (this corresponds to the upper left panels in Figs. 3,6). 
We show the results derived with d = 1 (dashed line), d = 2 (solid 
line) and d = 3 (dotted line). 

n -n/ \ (n-13)/6 -9p~ (1_ " )/3 /(S<T 2 ) , « „n 

/9 H -> : V(pR) qc p' R e M ; . (A7) 

The numerical factor 9/8 in eq.(A7) was changed to (27/20) 2 /2 
in Valageas (2002b) where we considered a critical-density 
universe. As noticed in section 2, this dependence on cos- 
mology can actually be neglected. Moreover, one easily see 
from eq.(6) that the asymptotic behaviour (A6) implies 
V{pr) = for pr < (by pushing the integration path 
over y to $l(y) — > +oo). 

We show in Fig.Al the integration paths over the com- 
plex y— plane used to compute the pdf V(pr) from eq.(A3), 
at z = and at the scale I = 4i gr id, for pR = 100, 10 and 5, 
which corresponds to the lower left panels in Figs. 3,6. For 
lower densities the integration path crosses the real axis at a 
larger y c (with y c — > +co for pr — > 0) while for high densi- 
ties it gets closer to the singularity y s < (with y c — * yf for 
Pr — * oo, see eq.(A5)) and it bends more closely along the 
branch cut (i.e. the negative real axis with y < y 3 ). More- 
over, for the same relative accuracy, for a higher density one 
only needs to integrate over a shorter length on y but with 
smaller steps. 



APPENDIX B: ROBUSTNESS OF THE MODEL 

In the simple expression (26) which defines our model, the 
last two terms are set by the quasi-linear limit and the low- 
density regime. However, the first two terms a + 6£ 2 are 
somewhat arbitrary. Indeed, although their normalization 
is given by the constraints (25) we could have used other 
powers or functions. Thus, we could replace for instance 
a + bC,' 2 by a + bC, d with any d larger than (n — l)/6 (so 
that the low-density limit ( — > remains correct). How- 
ever, we have checked through numerical computations that 
the dependence of our results on this free parameter is ac- 
tually negligible: the different curves we obtain for the pdf 
T'(pr) are almost indistinguishable in all cases described in 
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Figure B2. The cumulant generating function <p(y) obtained at 

2 = and at the scale I = Z gr id (this corresponds to the upper 
left panels in Figs. 3,6). We show the results derived with d = 1 
(dashed line), d = 2 (solid line) and d = 3 (dotted line). We 
also plot <p(y) for the lognormal model (36) (dot-dashed curve for 

y>o). 

section 3, where we compare our results with N-body simu- 
lations. The reason for this independence on the parameter 
d can be seen in Fig. Bl, where we display the function ?/(£) 
obtained from eqs.(22),(26). We consider the scale I — Z gr id 
(with Zg r id being the grid unit of the simulations) and the 
redshift z = 0, which is the most non-linear case (upper left 
panels in Figs. 3,6). We show in Fig. Bl the functions y(() 
obtained for a parameter d — 1,2 and 3, the exponent d — 2 
being our fiducial model as written in eq.(26). Then, we see 
that all three curves y(Q are very close for £ 1-5. Indeed, 
the low-density part £ < 1 goes to the rare-void limit de- 
scribed in section 2.3 and eq.(24) while the behaviour near 
f = 1 is constrained by (25). Next, we note that at £ s ~ 1.3 
the function y(Q reaches a minimum. This translates into a 
singularity for <p(y) since this means that £(y) and (p(y) be- 
come bivaluate. More precisely, near this minimum we have 
V — y s ~ (C - (s) 2 which yields ( - £ s ~ ±\y - y s \ 1/2 and 
the power 3/2 in eq.(Al) for ip{y). Then, as discussed in 
section A, this gives several branches for (p(y) and we must 
only keep the one which connects to the underdense regime, 
which corresponds to the part of y(Q with f < £ s . Since 
this singularity f s is close to 1 it is strongly constrained 
by eqs.(25) and the dependence on the parameter d of the 
function t(£) (or y(()) is very weak over this range C < C«- 
This behaviour can also be seen from Fig. B2 where we 
display the cumulant generating function ip(y) for the same 
cases as in Fig. Bl. The relevant branch of ip(y) which defines 
the pdf T(pr) is the one which runs through the origin y — 
and extends up to y — * +oo (extreme underdensities) . 
Again, we see that the three curves obtained for d = 1, 2 and 

3 are indistinguishable over this branch. By contrast, one can 
see some deviations appear as one follows the other branches 
down to y — > — oo (which corresponds here to £ — > +oo) but 
they have no signification for the pdf V(pr). 

For completeness, we also display in Fig. B2 the gener- 
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ating function ip(y) obtained for the lognormal model (36). 
It is restricted to y > since the integral (5) diverges for 
y < because the high-density tail of the lognormal de- 
creases more slowly than a simple exponential. In other 
words, we now have y s = 0. On the other hand, at very large 
y the lognormal approximation yields ip(y) ~ In 2 y contrary 
to the power-law (A6). Thus, one can already see on ip(y) 
the differences between various models for V{pn). 
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